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We argue that quantum optimal control can and should be done with analytic control functions, 
in the vast majority of applications. First, we show that discretizing continuous control functions as 
piecewise-constant functions prevents high accuracy optimization at reasonable computational costs. 
Second, we argue that the number of control parameters required is on-par with the dimension of 
the object manipulated, and therefore one may choose parametrization by other considerations, e.g. 
experimental suitability and the potential for physical insight into the optimized pulse. Third, we 
note that optimal control algorithms which make use of the gradient of the goal function with respect 
to control parameters are generally faster and reach higher final accuracies than non gradient-based 
methods. Thus, if the gradient can be efficiently computed, it should be used. Fourth, we present 
a novel way of computing the gradient based on an equation of motion for the gradient, which we 
evolve in time by the Taylor expansion of the propagator. This allows one to calculate any physically 
relevant analytic controls to arbitrarily high precision. The combination of the above techniques is 
GOAT (Gradient Optimization of Analytic conTrols) - gradient-based optimal control for analytic 
control functions, utilizing exact evolution in time of the derivative of the propagator with respect 
to arbitrary control parameters. 


I. INTRODUCTION 

The ability to drive a quantum system to a desired target 
in a fast and efficient manner is at the heart of emerging 
quantum technologies. The task of finding the optimal 
control pulse to transfer the quantum system to a desired 
state or to generate a desired quantum gate, has been the 
subject of extensive research since the first applications 
of optimal control algorithms [H-Q. Quantum optimal 
control was originally applied to chemical applications 
0, where progress has continued to this day [g. It has 
also been applied to nuclear magnetic resonance [3, 0, 
with applications to medical imaging and spectroscopy. 
Over the years, the birth of new experimental methods 
to control quantum systems has led to an increase in 
the interest in optimal control, which has been applied 
to processes as diverse as high harmonics generation 
control of energy flow in biomolecul es llOll. attosecond 
physics Ea and quantum computing |l2l Il3l|. 

Recently, the precision of experimental quantum control 
has drastically improved, driven primarily by the require¬ 
ments for quantum computing. Quantum circuits based 
on Josephson junctions can now approach 10“^ error 0 
and ion trap architecture can reach comparable precision 
0. Since an even higher accuracy may be necessary 
for fault tolerant quantum computing |16l Il7l | , one can 
expect further refinement of the experimental techniques 
in the near future. 

However, no current quantum control algorithm can effi¬ 
ciently achieve very high accuracy, i.e. error below 10“®. 
The barriers are two-fold. First is the use of a piece¬ 


wise constant (PWC) approximation to smooth control 
pulses. Quantum control algorithms such as Krotov Q 
and GRAPE Q rely on this approximation, which either 
compromises accuracy or requires egregious computing 
resources to achieve high accuracy, as will be shown in 
the following section. The second barrier is the inherent 
difficulty of non-gradient based methods to obtain high 
accuracy. The term non-gradient refers here to genetic 
algorithms and simplex methods, e.g. the Nelder-Mead 
algorithm used in the CRAB approach [0. Thus we 
argue that if the gradient information can be obtained 
efficiently, it should be used. 

Moreover, quantum optimal control is, at its core, a pow¬ 
erful framework for designing pulses for use in exper¬ 
iments. Hence, it should endeavor to provide control 
pulses which have a straightforward physical interpreta¬ 
tion, and are easy to implement experimentally. The 
common approach is to define an ansatz characterized 
by a few parameters which can be tuned in the experi¬ 
ments [0. The DRAG method (2^, for example, was 
developed for this purpose and is currently used in state 
of the art experiments on superconducting qubit systems 
(^ . However, current optimal control algorithms are 
not designed to work efficiently when the pulse has an 
analytic parametrization. 

To answer these challenges, we present a new approach, 
Gradient Optimization of Analytical conTrols (GOAT), 
that avoids the use of a piecewise constant approxima¬ 
tion. It is based on formulating an equation of motion for 
the gradient of the propagator with respect to the control 
parameters, which is coupled to the Schrodinger equa- 
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tion. Moreover, we present an efficient method to evolve 
in time the gradient using a Taylor series expansion of 
the propagator [^. GOAT then uses the gradient in¬ 
formation to achieve very high accuracy, for any smooth 
ansatz. After introducing GOAT, we present applications 
to different two qubit systems, and finally discuss several 
additional advantages of this approach. 


II. OPTIMAL CONTROL PROBLEM 

The task of locating the optimal control fields may be 
viewed as the search for a global optimum of a goal func¬ 
tion within a high-dimensional control landscape. Given 
a drift Hamiltonian Hq and a number of control Hamilto¬ 
nians Hk, we endeavor to find the time-dependent control 
functions c(t), parameterized by the parameters a, such 
that the overall Hamiltonian, 

C 

H{a,t) = Ho + Y,(=k{a,t)Hk ( 1 ) 

k=l 

drives the system to the desired outcome (e.g. imple¬ 
ments a CNOT gate). The control functions are often 
parameterized as a combination of analytic basis func¬ 
tions, e.g. a Fourier decomposition 


Cfc ia,t) = E -^k,m sin (wfc “h ^A:,m) ; (2) 

f=i 
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Figure 1. Error stemming from piecewise-constant approxi¬ 
mation of smooth controls. The time-dependent control (top 
plot) and the accuracy of propagation when approximated by 
a PWC function, at various discretizations. The system is 
composed of 3 qubits with a randomly chosen drift Hamilto¬ 
nian, a randomly chosen control Hamiltonian and a randomly 
chosen goal state (see text). The accuracy of propagation is 
measured by the relative error in the computed infidelity, as 
compared to a highly accurate adaptive propagation (Runge- 
Kutta 4/5 with 10“^^ relative tolerance). The PWC approx¬ 
imation is implemented by sampling the control field at the 
beginning of the time-slice (blue and green lines), or sampling 
at the middle of the time-slice (red and cyan), approximating 
a piecewise-linear interpolation. 

Additional tasks and associated goal functions may be 
defined, such as state transfer problems, open system dy¬ 
namics, etc. See (2^ for further examples. 


or a piecewise-constant (PWC) function. In the Fourier III APPROXIIVIATING CONTINUOUS 

example the parametrization a shall be defined as CONTROLS USING PIECEWISE-CONSTANT 

FUNCTIONS SHOULD BE AVOIDED 

O = {Afc ^k,rm k=l...C,j=l...m ' 


In the PWC example, the parametrization shall be the 
value of the controls at each constant piece. The propaga¬ 
tor at the final time, U (T), is defined by the time ordered 
(T) evolution of the time-dependent Hamiltonian, 


U{a,T) 



—H (a, t) dt 


( 4 ) 


The goal function to be minimized is defined as the 
projective-S' {U) distance (infidelity) between the imple¬ 
mented gate and the desired gate, C/goai, 


3(d) 


dim (U) 




( 5 ) 


An optimal control procedure utilizing PWC controls will 
usually converge to machine accuracy with a even a rel¬ 
atively small number of time-slices. However, an experi¬ 
mental implementation that smooths over the PWC con¬ 
trol fields will achieve results far below the anticipated 
accuracy. This is a result of the discrepancy between 
the propagations induced by the optimized vs. the im¬ 
plemented controls. To estimate this discrepancy, let us 
consider a simple continuous control, of the form of eq. 
m as depicted in fig. [1] (top). We measure the accuracy 
of the PWC approximation by comparing it to a highly 
accurate reference propagation. The results are depicted 
in fig. [T] (bottom). As one can see, in order to achieve 
high accuracy, many thousands of segments are required. 
Specifically, to achieve 10“® accuracy, 10® time-slices are 
required (using middle-of-slice sampling of the control 
field). The inadequacy of the PWC approximation to the 
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task of propagating time-dependent Hamiltonians is well 
known, and has led to the development of more suitable 
alternatives 24- 261. 


Such a large number of time-slices is problematic for two 
reasons. First, it requires an optimization in a control 
space of very high dimension, many orders of magnitude 
above what is required, as discussed in the following sec¬ 
tion. Optimizing in such huge space is unnecessary and 
wasteful. Second, the optimized PWC control may in¬ 
clude high-frequency components, which may result in 
significant discrepancies when the PWC control is im¬ 
plemented experimentally as a continuous control. We 
therefore conclude that in cases where smooth controls 
are implemented experimentally, and high accuracies are 
desired, PWC approximation of continuous controls are 
prone to inefficient and imprecise propagation, and hence 
optimization. 


IV. CONTROL SPACE DIMENSION 


We now turn to several interrelated issues concerning the 
analytic parametrization of the control field: How does 
one decide on a parametrization and its dimensionality, 
and how does one minimize the possibility of introducing 
local traps in optimization process? We propose to guide 
the choice of parametrization by the following guidelines: 
First, the parametrization should be chosen to fit the ex¬ 
perimental setup and/or to offer insight into the mech¬ 
anism of the optimized pulse. Second, the dimension of 
the control space must be larger than the dimension of 
the manifold supporting the dynamics. And third, that 
while any parameterizations of finite dimension is a form 
of constraint on the possible controls, the problem of lo¬ 
cal traps can be significantly reduced using insight into 
optimization landscape topology [27 1. 


Let us begin by addressing the second issue: dimension of 
the control space. As is seen in fig. [2^, once the control 
dimension reaches the dynamic Hilbert space dimension 
(blue vertical line), optimization virtually always suc¬ 
ceeds (color of dots indicate probability of convergence, 
low probability (red) to the left of the blue line, virtual 
certainty (green) to the right). It has been our experi¬ 
ence that adding a few additional dimensions can help 
in cases where local traps are abundant (e.g. near the 
quantum speed limit). 


A more in-depth analysis of this issue is visible in fig. 
Hj, where we clearly see that for state transfer tasks 
over a Hilbert space of dimension d, a control space of 
dimension 2c? — 2 is reqnired, while for gate generation 
tasks a d? dimension is required. The data strongly sug¬ 
gests that the match between Hilbert space dimension 
and minimal control-space dimension holds true regard¬ 
less of parametrization: Squares denote a simple PWC 




Figure 2. (a) Convergence as a function of control space di¬ 
mension. Color of dots indicates probability of convergence 
to fidelity better than 10“^^, when starting at a random ini¬ 
tial point. Y-axis indicates computation time required for 
convergence (logarithmic scale). Goal is generation of a 2 
qubit random gate (with propagator Hilbert size of 16, blue 
line), using a single random control Hamiltonian. The con¬ 
trol function is parameterized by the Fourier basis, with the 
control space comprising the amplitudes, and random fixed 
frequencies and phases, (b) The minimal control space di¬ 
mension required to always succeed in a random-to-random 
state transfer task and a random gate generation task, for var¬ 
ious system dimensions. Note excellent match of the minimal 
control space dimension (dots) to the Hilbert space dimension 
(dashed and solid black lines). 


parametrization with a single control dimension per time- 
slice, while circles denote a PWC parametrization with 
flexible-width time-slices and two control dimensions per 
slice. Reference [2^ presents data in support of this hy¬ 
pothesis using Fourier parametrization for state-transfer 
tasks. This makes a strong case that the number of con¬ 
trol parameters is independent of the optimal control task 
details. 

Returning to figure |2^, we observe than when one uses 
significantly higher number of controls than is necessary. 


Optimization time [secs] 
















4 


the computational effort required to optimize rises signif¬ 
icantly. This is due to both the added numerical effort 
in calculating gradients as well as the need to search a 
much larger space, requiring many more iterations and 
overcoming the many zero-gradient redundant directions. 
It is therefore clear that one can and should aim for a 
modest control space dimension, in-line with the size of 
the dynamic Hilbert space. Note that this strengthens 
the case against PWC approximation of smooth controls, 
where the number of control parameters is tied to the 
number of time slices in the discretization. 

Any finite-dimensional parametrization is, by construc¬ 
tion, a form of constraint over the infinite-dimensional 
function space. However, it has been our experience 
that, with some exceptions, local traps are not a serious 
problem, and may be mitigated by slightly expanding 
the control space or alternating between different sub¬ 
spaces of a significantly-expanded control space (e.g. in 
the Fourier basis we can alternate between optimizing 
the amplitudes and the frequencies 113). Additional fac¬ 
tors affecting the appearance of local minima include the 
specifics of the drift and control Hamiltonians, the prox¬ 
imity to the quantum speed limit and the distance of the 
initial search point relative to the global optimum. The 
above observations leave us with a great deal of freedom 
to choose the parametrization of the control fields ac¬ 
cording to other considerations. Specifically, one would 
choose a parametrization which fits the experimental con¬ 
straints (e.g. bandwidth-limited controls which smoothly 
start and finish at zero). Alternatively, one may chose 
a parametrization which holds the promise of produc¬ 
ing an optimal control sequence from which physical in¬ 
sight can be derived (e.g. Gaussian pulses). The GOAT 
method, described below, can easily handle any form of 
constraints and parametrization which may be described 
by a piecewise-analytic function. 


V. GRADIENT OPTIMIZATION OF ANALYTIC 
CONTROLS (GOAT) 

We have seen that smooth analytic controls can have 
a significant advantage over PWC approximations in 
accuracy, optimization time, control space dimension 
and flexibility to match experimental constraints. This 
stems from the inherent inaccuracy of propagating time- 
dependent Hamiltonians using PWC approximations, 
and therefore holds relative to all previous methods that 
require a PWC approximation. Attempts to enforce an¬ 
alytical control shapes on PWC controls, using the Ja¬ 
cobian matrix to convert PWC gradients to the origi¬ 
nal analytical parametrization |l9l l29|, suffer from the 
same disadvantage - the underlying PWC approximation 
of smooth controls. What now remains to be shown is 
that gradient-based optimization of analytic controls is 


indeed possible and efficient. 

Consider a goal function g = Tr Then 

the gradient of the goal function with respect to the con¬ 
trol parameters is 

a,9 (a) = -real (t/,'„,&t/ (o, t)) ) 

( 6 ) 

where daU {a,t) is the gradient of the propagator with 
respect to the controls (for details see [23|). The focus 
then becomes the propagator gradient, daU (d, T), which 
can be computed in the following way: The equation of 
motion for the propagator is 

dtU {a,t) =-^H {a,t)U {a,t). (7) 

Therefore, the equation of motion for daU is simply 

dtdaU = ((Said) U + Hds^U) , (8) 

where we have dropped the (d, t) dependence for con- 

_ 

venience, where daH {a,t) = Y.k=i {daCk {a,t))Hk and 
where daCk{a,t) is parametrization-specific, following 
eq. m See also (s^l- Thus, we have two coupled equa¬ 
tion of motion, for U and 9at/, which may be propagated 
together as a single aggregate object 

<») 

The evolution in time of the propagator gradient may be 
performed by any mechanism for ODE integration that is 
accurate and efficient for time-dependent Hamiltonians. 
Given a parametrization a of dimension a, one will need 
to propagate a -I- 1 objects to compute the full gradient 
of the goal function. Note that the computational cost 
of a joint propagation of all components concurrently is 
significantly cheaper than evaluating each component in¬ 
dividually, and moreover may be easily parallelized. 

A good choice of propagator, given that we have access to 
the analytic form of the controls, and hence to all their 
derivatives^s the Taylor expansion of 17 (T) at t = 0, 
following [^[^. Specifically, 

OO fjik 

C/(d,T) = ^—5fH(d,f)|^^^ (10) 
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d?U = U{t) 

dtU = -^HU 

a 

d^U = -^-{dtHU+ H{t)dtU) 

E (M dru 

^ m=0 ^ ^ 

Equation [TO] can be derived with respect to the control 
parameters, and as partial differentiation commute, we 
may write 



Computational Time (minutes) 


d\ (d^u) = -1M dru 

m—0 ^ '' 

+ drOsU^. 

( 11 ) 

We have found that evolution in time using this hierarchy 
of equations is more efficient than Runge-Kutta. 

If a single Taylor expansion is not enough to span the en¬ 
tire evolution duration, the result of the previous prop¬ 
agation at U {t + dt) serves as the initial value for the 
subsequent Taylor expansion. This relay-race mechanism 
may also be used at control discontinuities in conjunc¬ 
ture with other ODE integration scheme, e.g. Runge- 
Kutta, allowing GOAT to implement piecewise-analytic 
parametrization, such as B-splines. 

Once one can efficiently compute the goal function 
derivative with respect to the control parameters, any 
gradient-driven optimization routine may be used, e.g. 
Newton methods. 


VI. EXAMPLES 


Fig. [3] shows the convergence of a CNOT quantum gate 
generation on a two qubit Ising chain. The system de¬ 
tails correspond to Problem 1 of [^. The ansatz used is 
a truncated Fourier series with 16 terms. GOAT and 
Nelder-Mead optimize here exactly the same analytic 
function, with the same initial guess of the parameters. 
One can observe that gradient-based methods are indeed 
preferable to non-gradient methods for high accuracy op¬ 
timization, when the gradient is available. 

Fig. [4] shows a GNOT generation on a two qubit NV 
center system. Each of the two controls has 5 Fourier 
components. The system details, including the normal¬ 
ization of time, correspond to Problem 15 of [^. GOAT 


Figure 3. The relative convergence behavior of GOAT (with 
Newton search) vs. Nelder-Mead. The goal is the generation 
of a CNOT gate on a two qubit Ising chain. With Nelder- 
Mead each iteration requires the evaluation of the goal func¬ 
tion. With GOAT, each iteration requires the joint evalua¬ 
tion of the goal function and a derivatives, with a being the 
parametrization dimension. Note that the computational cost 
is less than a+ 1, as joint propagation is significantly cheaper 
than evaluating each component individually. No paralleliza¬ 
tion was used. We therefore elected to measure performance 
in time required to perform the computation. Note the perfor¬ 
mance ratio between the two methods should remain constant 
regardless of hardware. 


can reach 10 accuracy with a smooth control, which 
would not be feasible with standard algorithms. 


VII. DISCUSSION 


As with many other gradient-based optimization 
schemes, GOAT can be used to optimize open quan¬ 
tum systems or goals defined on projected subsystems. 
The GOAT approach, however, holds several important 
advantages, in addition to accuracy and adaptability to 
experimental constraints and intuition with respect to 
physical mechanisms. These can be divided into the fun¬ 
damental and the technical. 


On the fundamental side, constraints may be imple¬ 
mented by mapping the optimization from an uncon¬ 
strained space to a constrained subspace, and comput¬ 
ing the gradient of the new goal function with the chain 
rule (e.g. bandwidth and amplitude constraints for a 
frequency basis or initial and final values for the control 
fields). Therefore we consider an unconstrained optimiza¬ 
tion in a constrained subspace . Second, since GOAT 
provides a convenient way to calculate the derivatives 
of the goal function with respect to the control parame- 
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Figure 4. (a) Two control pulses to generate a CNOT gate on 
NV center system, (b) Timing of the convergence of GOAT 
toward the optimal solution. 


ters, one may extend GOAT to calculating the Hessian. 
Third, GOAT allows one to produce pulses which are 
robust to changes in system characterization (e.g. the 
drift Hamiltonian), by adding the norm of the gradient 
with respect to the system parameters to the goal func¬ 
tion. Fourth, a fundamental property of GOAT is the 
flexibility of parametrization, allowing one to switch pa- 
rameterizations mid-way through an optimization. This 
could be used, for example, to avoid local traps or to ac¬ 
celerate the convergence. Fifth, by specifying equations 
of motion and not using the Pontryagin’s minimum prin¬ 
ciple, GOAT removes the need for an adjoint state, and 
its associated backward propagation. This could be very 
useful in systems where the dynamics is not reversible. 
Finally, we note the equations of motion describing a de¬ 
vice response function (e.g. bandpass filter) may be nat¬ 
urally integrated into the GOAT equations of motion, 
allowing one to account very accurately for experimental 
limitations (^ . 


On the technical side, the propagation of each gradient 
component does not depend on the propagation of the 
other gradient components, allowing for parallelization 
of the computation. Second, all high-order derivatives 
which are needed for the Taylor expansion of Eq. (HU, 
can be pre-computed automatically using a symbolic al¬ 
gebra engine. Finally, GOAT is memory efficient, as one 
does not need to store information on the gradient at 
each time step to compute the goal function. Only the 
value at the final time is required. 


VIII. CONCLUSION 


In conclusion, we have presented GOAT - gradient-based 
optimal control for analytic control functions. GOAT 
avoids the problematic PWC approximation of time- 
dependent Hamiltonians, by using a set of novel equa¬ 
tions of motion for the gradient of the propagator with 
respect to the control parameters. We believe GOAT is 
the method of choice for designing optimal pulses for ex¬ 
periments requiring smooth controls and high accuracy, 
geared toward high precision quantum technology exper¬ 
iments. 

GOAT possesses many additional advantages, mentioned 
above, which will be explored in depth in the near fu¬ 
ture. An open-source implementation of GOAT (in Mat- 
lab and QuTIP Python toolbox) is currently in develop¬ 
ment. The authors are interested in pursuing experimen¬ 
tal and theoretical applications of the GOAT approach, 
and will gladly cooperate with other interested parties 
towards this goal. 


* shai.machnes@gmail.com 
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